Structured Image Classification from Conditional Random Field with Deep Class Embedding
نویسندگان
چکیده
This paper presents a novel deep learning architecture to classify structured objects in datasets with a large number of visually similar categories. Our model extends the CRF objective function to a nonlinear form, by factorizing the pairwise potential matrix, to learn neighboring-class embedding. The embedding and the classifier are jointly trained to optimize this highly nonlinear CRF objective function. The non-linear model is trained on object-level samples, which is much faster and more accurate than the standard sequence-level training of the linear model. This model overcomes the difficulties of existing CRF methods to learn the contextual relationships thoroughly when there is a large number of classes and the data is sparse. The performance of the proposed method is illustrated on a huge dataset that contains images of retail-store product displays, taken in varying settings and viewpoints, and shows significantly improved results compared to linear CRF modeling and sequence-level training.
منابع مشابه
Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning
Domain adaptation is a powerful technique given a wide amount of labeled data from similar attributes in different domains. In real-world applications, there is a huge number of data but almost more of them are unlabeled. It is effective in image classification where it is expensive and time-consuming to obtain adequate label data. We propose a novel method named DALRRL, which consists of deep ...
متن کاملLearning in the Deep-Structured Conditional Random Fields
We have proposed the deep-structured conditional random fields (CRFs) for sequential labeling and classification recently. The core of this model is its deep structure and its discriminative nature. This paper outlines the learning strategies and algorithms we have developed for the deep-structured CRFs, with a focus on the new strategy that combines the layer-wise unsupervised pre-training usi...
متن کاملRODRIGUEZ-SERRANO, PERRONNIN: LABEL EMBEDDING FOR TEXT RECOGNITION 1 Label embedding for text recognition
The standard approach to recognizing text in images consists in first classifying local image regions into candidate characters and then combining them with high-level word models such as conditional random fields (CRF). This paper explores a new paradigm that departs from this bottom-up view. We propose to embed word labels and word images into a common Euclidean space. Given a word image to b...
متن کاملHierarchical Conditional Random Field for Multi-class Image Classification
Multi-class image classification has made significant advances in recent years through the combination of local and global features. This paper proposes a novel approach called hierarchical conditional random field (HCRF) that explicitly models region adjacency graph and region hierarchy graph structure of an image. This allows to set up a joint and hierarchical model of local and global discri...
متن کاملGeodesic pixel neighborhoods for multi-class image segmentation
Multi-class image segmentation is a complex problem that poses several challenges: developing better classifiers, designing more discriminative features, finding efficient optimization techniques and modeling the relations between image pixels in different image regions. In this paper we focus on the last one. A common way to address the problem of structured prediction is to model it as a Cond...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1705.07420 شماره
صفحات -
تاریخ انتشار 2017